Abstract



An XML framework for concept description is given, based upon the
fact that the tree structure of XML implies the logical structure of concepts
as defined by attributional calculus. In particular, the attribute-value
representation is implementable in the XML framework. Since the
attribute-value representation is an important way to represent knowledge in AI,
the framework offers a further and simpler way than the powerful RDF
technology.
1 Introduction

Knowledge representation is an important and wide area of Artificial
Intelligence. The simplest way to represent knowledge of an object is the
attribute-value representation. Here an object is characterized by its
attributes, each of which has a fixed range of values. A concept then is a
category of objects specified by a logical combination of attribute values.

On the other hand, an XML document consists of elements specified by
attribute-value pairs and nested in a hierarchical tree structure. Hence the idea
of applying XML to knowledge representation via the attribute-value
representation is straightforward. The logical notions of rules and concepts map
rather naturally into the XML framework; demonstrating this is the main
purpose of this paper.

The application of XML to concept description has a lot of advantages. 
Since XML is a universal and web-based data format, it is appropriate for 
platform-independent and world-wide use. XML nowadays has become a widely 
accepted standard data interchange technology so that its general usability is 
guaranteed for a long time. 
Another aspect is that XML allows instance documents to be checked for
data consistency against the corresponding XML schema. This separates
consistency checks from the applications processing the data.

The new XML framework supplements different approaches towards
knowledge representation that use RDF technology to build a Semantic Web
[..., 9, 12, 13]. But whereas RDF is the more powerful technology, enabling
semantic links by an entity-relationship implementation, the XML framework is
simpler and more appropriate to a description of concepts based upon the
attribute-value representation as well as to their storage and to data interchange. The
framework enables and simplifies worldwide data access for subsequent
applications, e.g. machine learning programs.

The present paper is organized as follows. In section 2 a brief overview of
XML is given, followed by a short introduction to knowledge representation and
attributional calculus in section 3. The core of the paper is section 4, which models
concept description in XML; section 5 provides Emerald's world as an example
of this model. A short discussion concludes the paper.

2 XML 

Among other purposes, XML has been designed as a data interchange
format. By definition it is a textual markup language consisting of elements which
are organized in a tree structure. Syntactically, each element is opened
by a start tag, <name>, and closed by an end tag, </name>. To reflect the tree
structure, any element opened after the start tag of a previous element must
be closed before the previous element is closed.

Any element can have m children as well as n attributes, $m, n \in \mathbb{N}_0$ (both
m and n may be zero). An attribute is written in the start tag of the element
with its value in quotation marks ("), i.e.

<name attribute="value"> ... </name>
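The nesting rule and the attribute syntax can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the element names `child`, `a` and `b` and the helper `is_well_formed` are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# A well-formed document: each <child/> is closed before its parent <name>.
doc = '<name attribute="value"><child/><child/></name>'
root = ET.fromstring(doc)

attributes = dict(root.attrib)        # the n attributes of the element
children = [c.tag for c in root]      # the m children of the element


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Overlapping tags violate the tree structure and are rejected by the parser.
overlapping = "<a><b></a></b>"

print(attributes, children, is_well_formed(overlapping))
```

Here `<b>` is opened after `<a>` but closed after it, so the second string is not a tree and the parser raises an error.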

If an element in an XML document has no child, it can be written as <name/>,
i.e. <name></name> = <name/>. The possible elements of an XML document
are declared in its DTD (document type definition) or, more generally, in its
XML schema. An XML schema is itself an XML document with a predefined
element set. In particular, the root element of an XML schema is <xsd:schema>
... </xsd:schema>. Here xsd is the prefix bound to the XML Schema namespace
(xmlns) of the W3C,

xmlns:xsd="http://www.w3.org/2001/XMLSchema".

A typical XML schema looks like the following source code. 

<?xml version="1.0"?>
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="...">

</xsd:schema>

For details see [...].
3 Knowledge representation and attributional calculus



Attributional calculus [...] is a simple description language whose
representational power lies between propositional calculus and first-order predicate logic. It
serves as a technique for knowledge representation. In the sequel a set-theoretic
version of attributional calculus is presented and briefly compared to the standard
VL1 notation.

Let X be a given finite set of objects. An attribute then is a mapping $a: X \to W$,
$x \mapsto a(x)$, where W is a given finite set, the range of the attribute values
of the objects x in X. Therefore, a(x) is the value of the attribute that object
x possesses. Let the objects x in X be uniquely characterized by a finite set
of attributes $A = \{a_1, \ldots, a_n\}$, where each attribute has a fixed finite range
$W_i = a_i(X)$, $i = 1, \ldots, n$. Then each attribute can be written as

$$a_i : X \to W_i, \quad x \mapsto a_i(x) \qquad (i = 1, \ldots, n) \tag{1}$$

Thus by this choice of attributes an object x is well distinguishable from the
others by all its attribute values $a_1(x), \ldots, a_n(x)$. In other words, each object
x corresponds uniquely to a vector of attribute values $w_1, \ldots, w_n$,

$$x \leftrightarrow (w_1, \ldots, w_n) \in W_1 \times \cdots \times W_n. \tag{2}$$
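The correspondence between objects and value vectors can be made concrete in a few lines of Python. This is only a sketch under stated assumptions: the attribute names `shape` and `color`, their ranges, and the choice of X as the full Cartesian product are hypothetical and not taken from the paper.

```python
from itertools import product

# Hypothetical ranges W_1, W_2 of two attributes a_1 ("shape"), a_2 ("color").
ranges = {
    "shape": ("round", "square"),
    "color": ("red", "green", "blue"),
}
names = tuple(ranges)

# Assumption: X contains one object for every vector in W_1 x W_2.
X = [dict(zip(names, values)) for values in product(*ranges.values())]


def attribute(name):
    """Return the mapping a: X -> W for the attribute called `name`."""
    return lambda x: x[name]


shape = attribute("shape")

# Each object corresponds to exactly one vector (w_1, ..., w_n).
vectors = {tuple(x[n] for n in names) for x in X}
print(len(X), len(vectors))
```

Since the number of distinct vectors equals the number of objects, the objects are uniquely characterized by their attribute values, as required for eq. (2).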

The attribute-value representation formalizes the information that we have 
about the object set X and its objects. Hence it is a precise notion for (one 
kind of) knowledge representation. 

The basic construct of attributional calculus is the elementary selector S(a, w)
given for an attribute-value pair (a, w) by $S: \mathcal{D} \to 2^X$, $(a, w) \mapsto S(a, w)$, with

$$S(a, w) = \{x \in X : a(x) = w\}. \tag{3}$$

Here $2^X$ denotes the power set of X, and $\mathcal{D}$ is the attribute-value set,
$\mathcal{D} = \bigcup_{i=1}^{n} (\{a_i\} \times W_i)$. In other words, S(a, w) selects all objects whose attribute a
has value w. Of course, this set might be empty. An empty selector corresponds
to a "don't care." Often, a selector S is written a bit sloppily as

$$S = \{a = w\}. \tag{4}$$
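As an illustration of eq. (3), an elementary selector can be written as a set comprehension. The sketch below assumes a small hypothetical object set with the attributes `shape` and `color`; none of these names stem from the paper.

```python
from collections import namedtuple
from itertools import product

# Hypothetical attributes and ranges; X is taken as the full product.
ranges = {"shape": ("round", "square"), "color": ("red", "green", "blue")}
Obj = namedtuple("Obj", list(ranges))
X = {Obj(*values) for values in product(*ranges.values())}


def S(a, w):
    """Elementary selector S(a, w) = {x in X : a(x) = w}, cf. eq. (3)."""
    return {x for x in X if getattr(x, a) == w}


red = S("color", "red")
print(sorted(red))
```

The result is a subset of X, i.e. an element of the power set, containing exactly the objects whose `color` is `red`.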

This is justified by Boole's second law [2, §XII.1]. In VL1 notation [10], a
general selector is written as $[a \bowtie w]$, where $\bowtie$ is a relation satisfying
$\bowtie \in \{=, \neq, >, \geq, <, \leq\}$. Since the range W is finite, a general selector is a disjunction
of elementary selectors. For instance, if $W = \{w_1, w_2, \ldots, w_m\}$, the selector
$[a \neq w_1]$ is equivalent to $[a = w_2] \vee [a = w_3] \vee \ldots \vee [a = w_m]$.
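The reduction of a general selector to elementary selectors can be checked mechanically on a small example. In the following sketch the object set, the numeric attribute `a` and the function names are assumptions made for illustration only.

```python
import operator

# Hypothetical objects (label, value of attribute a); W = a(X) is the range.
X = {("x1", 1), ("x2", 2), ("x3", 2), ("x4", 4)}


def a(x):
    """The attribute a: X -> W, here simply the second component."""
    return x[1]


W = sorted({a(x) for x in X})


def elementary(w):
    """Elementary selector [a = w]."""
    return {x for x in X if a(x) == w}


def general(rel, w):
    """General selector [a rel w], evaluated directly on the objects."""
    return {x for x in X if rel(a(x), w)}


def as_disjunction(rel, w):
    """The same selector as a union of elementary selectors over W."""
    values = [v for v in W if rel(v, w)]
    return set().union(*(elementary(v) for v in values))


print(sorted(general(operator.ne, 1)))
```

For every relation and every value of the finite range, both functions return the same set, which is the equivalence stated above.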

A rule R is an intersection of selectors $S_1, \ldots, S_m$,

$$R = S_1 \cap \ldots \cap S_m, \tag{5}$$

where selector $S_i$ corresponds to attribute $a_i$. Note that a selector may be
empty and can be omitted in this case. A concept C is a union of rules $R_1, \ldots, R_k$,

$$C = R_1 \cup \ldots \cup R_k. \tag{6}$$
In VL1 notation, an intersection corresponds to a conjunction, and a union
corresponds to a disjunction. Analogously to the context of Boolean functions,
we call this representation the disjunctive normal form [2, §III.5] of a concept.
In the next section it will be worked out that this simple structure of concepts
is naturally represented in XML.
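Eqs. (5) and (6) translate directly into set operations. The following sketch again uses a hypothetical object set; the attributes `shape`, `color` and `size` and the helper names `rule` and `concept` are assumptions, and an omitted selector is treated as imposing no condition.

```python
from collections import namedtuple
from itertools import product

# Hypothetical attributes and ranges; X is taken as the full product.
ranges = {
    "shape": ("round", "square"),
    "color": ("red", "green", "blue"),
    "size": ("small", "large"),
}
Obj = namedtuple("Obj", list(ranges))
X = {Obj(*values) for values in product(*ranges.values())}


def S(a, w):
    """Elementary selector, cf. eq. (3)."""
    return {x for x in X if getattr(x, a) == w}


def rule(*selectors):
    """Intersection of selectors, cf. eq. (5); omitted selectors do not restrict."""
    result = set(X)
    for s in selectors:
        result &= s
    return result


def concept(*rules):
    """Union of rules, cf. eq. (6)."""
    return set().union(*rules)


R1 = rule(S("shape", "round"), S("color", "red"))
R2 = rule(S("shape", "square"), S("size", "small"))
C = concept(R1, R2)
print(len(X), len(R1), len(R2), len(C))
```

The set C coincides with the objects satisfying the disjunctive normal form "(shape = round and color = red) or (shape = square and size = small)".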

4 XML model of concepts 

The XML model of concepts relies on the fact that the structure of XML implies 
the structure of attributional calculus. Three implications are immediately 
observed: 

1. a selector is representable by an attribute-value pair, cf. eq. (4);

2. the intersection (conjunction) of selectors corresponds to a list of
attribute-value pairs in a single element;

3. the union (disjunction) of rules corresponds to the creation of children of
a parent element.

Thus the XML model of concepts is straightforwardly achieved: A concept can
be considered as an element which has zero, one, or more rules as child
elements; a selector is simply an attribute-value pair of a rule element. For
instance, a concept may be given as

<concept>
  <rule a_i="w_ij" ... a_k="w_kl"/>
  <rule a_m="w_mn" ... a_p="w_pq"/>
</concept>

Here $a_i$ is attribute number i, $W_i$ is its range, and $w_{ij} \in W_i$ is one of its
possible values. Hence a concept can be graphically represented by the following
diagram.

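[Figure: tree diagram. The root node "concept" has the children "rule 1", ...,
"rule k"; rule 1 has the leaves "selector S_11", ..., "selector S_1n_1", and
rule k has the leaves "selector S_k1", ..., "selector S_kn_k".]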
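The three correspondences listed above suggest how a concept document might be read back into its logical form. The following is a minimal sketch with Python's standard `xml.etree.ElementTree`; the attribute names `shape` and `color` and the functions `parse_concept` and `belongs_to` are hypothetical, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical concept document with two rules.
xml_text = """<concept>
  <rule shape="round" color="red"/>
  <rule shape="square"/>
</concept>"""


def parse_concept(text):
    """Return the concept as a list of rules, each a dict of selectors."""
    root = ET.fromstring(text)
    return [dict(rule.attrib) for rule in root.findall("rule")]


def belongs_to(obj, rules):
    """True if obj satisfies all selectors of at least one rule."""
    return any(
        all(obj.get(a) == w for a, w in rule.items())
        for rule in rules
    )


rules = parse_concept(xml_text)
print(rules)
print(belongs_to({"shape": "square", "color": "blue"}, rules))
```

Each `rule` element becomes a conjunction of its attribute-value pairs, and the list of rule elements becomes their disjunction. In this sketch a concept without rules matches no object, while a rule without attributes matches every object.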



To enable this construct, the corresponding XML schema for a concept has to 
be given as in the following source code. 

<?xml version="1.0"?>
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.math-it.org/xml/2002/concept.xsd"
    xmlns="http://www.math-it.org/xml/2002/concept.xsd"
    elementFormDefault="qualified"
>
  <!-- a concept consists of zero or more rules -->
  <xsd:element name="concept">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="rule" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <!-- one optional attribute per selector -->
            <xsd:attribute name="attribute_1" type="W_1"/>
            <!-- ... -->
            <xsd:attribute name="attribute_n" type="W_n"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <!-- the finite ranges of the attributes, declared as enumerations -->
  <xsd:simpleType name="W_1">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="w_11"/>
      <!-- ... -->
      <xsd:enumeration value="w_1m"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- ... -->
  <xsd:simpleType name="W_n">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="w_n1"/>
      <!-- ... -->
      <xsd:enumeration value="w_nk"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>

The names for the attributes, e.g. attribute_i, as well as for the ranges (data
types) of the values, e.g. W_i, have to be adjusted appropriately.

This XML model can easily be extended to enable naming of concepts or
rules by adding another child to the concept element or the rule element,
respectively.



5 An example: Emerald's robots 

To illustrate the notion of a concept and its implementation in XML, let us
consider as an example the world of Emerald's robots (footnote 1). It is a software system
consisting of objects called "robots." Each robot is described by the values $w_1$,
..., $w_6$ of six attributes, $w_i \in W_i$. The attributes with their ranges are listed
in table 1.

attribute      range of values

headShape      W_1 = {"round", "square", "octagon"}
bodyShape      W_2 = {"round", "square", "octagon"}
isSmiling      W_3 = {"true", "false"}
holding        W_4 = {"sword", "balloon", "flag"}
jacketColor    W_5 = {"red", "yellow", "green", "blue"}
hasTie         W_6 = {"yes", "no"}

Table 1: The robots' attributes and their ranges.

1 EMERALD: Experimental Machine Example-based Reasoning And Learning Disciple,
see www.mll.gmu.edu; at this URL you can also find Java-based animations illustrating the system.

A concept now is a specific description of a robot category. For instance,

C = "head is round and jacket is red, or head is square and is holding a
balloon"

is a concept. There are 3 · 3 · 2 · 3 · 4 · 2 = 432 different robot objects in this world,
84 of which belong to the category C. In our XML framework, the concept C
is implemented as

<concept>
  <rule headShape="round" jacketColor="red"/>
  <rule headShape="square" holding="balloon"/>
</concept>
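The two counts quoted above can be reproduced by enumerating the robot world. The following sketch takes the ranges from table 1 and the concept document as given; the variable and function names are assumptions, and the document is parsed without namespaces.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Attribute ranges as listed in table 1.
RANGES = {
    "headShape": ("round", "square", "octagon"),
    "bodyShape": ("round", "square", "octagon"),
    "isSmiling": ("true", "false"),
    "holding": ("sword", "balloon", "flag"),
    "jacketColor": ("red", "yellow", "green", "blue"),
    "hasTie": ("yes", "no"),
}

CONCEPT_XML = """<concept>
  <rule headShape="round" jacketColor="red"/>
  <rule headShape="square" holding="balloon"/>
</concept>"""

# One robot per vector of attribute values.
robots = [dict(zip(RANGES, values)) for values in product(*RANGES.values())]
rules = [dict(r.attrib) for r in ET.fromstring(CONCEPT_XML).findall("rule")]


def in_concept(robot):
    """True if the robot satisfies all selectors of at least one rule."""
    return any(all(robot[a] == w for a, w in rule.items()) for rule in rules)


members = [robot for robot in robots if in_concept(robot)]
print(len(robots), len(members))
```

The first rule covers 3 · 2 · 3 · 2 = 36 robots and the second 3 · 2 · 4 · 2 = 48; since the head shapes differ, the two rules are disjoint and the concept contains 36 + 48 = 84 robots.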

The corresponding XML schema is given in the appendix; it can also be found
on the web at the URL

http://www.math-it.org/xml/2002/emerald.xsd

It defines the concept structure as well as the attribute ranges of the robots.
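In practice this check is left to a validating XML parser and the schema. Python's standard library offers no XML Schema validation, so the following sketch is merely a hand-written stand-in that mirrors the enumerations of table 1; the function `check_concept` is hypothetical, and the sketch ignores namespaces and the enclosing `emerald` element of the schema.

```python
import xml.etree.ElementTree as ET

# Attribute ranges as listed in table 1.
RANGES = {
    "headShape": ("round", "square", "octagon"),
    "bodyShape": ("round", "square", "octagon"),
    "isSmiling": ("true", "false"),
    "holding": ("sword", "balloon", "flag"),
    "jacketColor": ("red", "yellow", "green", "blue"),
    "hasTie": ("yes", "no"),
}


def check_concept(text):
    """Return a list of error messages; an empty list means consistent."""
    errors = []
    root = ET.fromstring(text)
    if root.tag != "concept":
        errors.append(f"unexpected root element <{root.tag}>")
    for number, child in enumerate(root, start=1):
        if child.tag != "rule":
            errors.append(f"child {number}: unexpected element <{child.tag}>")
            continue
        for a, w in child.attrib.items():
            if a not in RANGES:
                errors.append(f"rule {number}: unknown attribute {a}")
            elif w not in RANGES[a]:
                errors.append(f"rule {number}: value {w} not in range of {a}")
    return errors


good = '<concept><rule headShape="round" jacketColor="red"/></concept>'
bad = '<concept><rule headShape="triangle" wheels="4"/></concept>'
print(check_concept(good))
print(check_concept(bad))
```

The point of the design is the same as in the schema approach: the consistency check runs before, and independently of, any program that processes the concept.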

6 Discussion 

In this paper an XML framework for concept description is proposed. Concepts
are expressed with the aid of attributional calculus as set-theoretic combinations
of rules and selectors. An important observation is that the structure of
concepts is implied by the structure of an XML document. In particular, a selector
is representable by an attribute-value pair, a rule as an intersection
(conjunction) of selectors by a list of attribute-value pairs in a single element, and a
union (disjunction) of rules by the creation of children of a parent element. In
this way, the XML framework for concept descriptions can be developed in a
straightforward manner.

As a consequence, the attribute-value representation in this XML framework
offers a route to represent knowledge, distinct from similar but more powerful
approaches based, e.g., on the RDF technology [..., 12, 13].
Since the framework refers to concepts in the sense of attributional calculus,
it cannot represent all possible logical connectives. For instance, it does not
provide recursive structures (the mainstay of RDF where, e.g., the object of
an RDF property can itself have arbitrary properties), nor can it express inductive
concepts which have only necessary conditions (cf. the concept "human" and the
classical "featherless biped" example due to Aristotle). In addition, it has no
quantification, negation, etc., although this could be implemented easily in a
richer schema for concept definitions, as is done in a wide range of web-based
knowledge representation languages such as OKBC [12], DAML+OIL [9], OWL
[13], or full FOL [7].

Moreover, the attribute-value representation on which the XML framework
is based is only applicable to knowledge systems in which all attributes have
finite ranges of values.

However, the framework is still rich enough to tackle problems of machine
learning. Here the emphasis is laid upon the representation of learning
examples, which are completely determined by their attributes as well as an
additional Boolean flag indicating whether they are positive or negative. Thus the
framework makes it possible to store concepts in XML documents and to have them
further processed by concept learning algorithms. Validation of a concept
document against its corresponding XML schema by standard XML parsers can
be used as a data consistency check, independently of subsequent processing
programs. The example of Emerald's robots indicates how this
can be done.
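How a concept learning program might use such data can be sketched as follows. The three learning examples, the candidate concepts and the function names are hypothetical; the paper does not fix how the Boolean flag is stored, so it is kept here as a plain Python value next to the attribute values.

```python
# Hypothetical learning examples: (attribute values, positive flag).
examples = [
    ({"headShape": "round", "bodyShape": "square", "isSmiling": "true",
      "holding": "flag", "jacketColor": "red", "hasTie": "no"}, True),
    ({"headShape": "square", "bodyShape": "round", "isSmiling": "false",
      "holding": "balloon", "jacketColor": "blue", "hasTie": "yes"}, True),
    ({"headShape": "octagon", "bodyShape": "round", "isSmiling": "true",
      "holding": "sword", "jacketColor": "red", "hasTie": "no"}, False),
]

# Candidate concept in disjunctive normal form, one dict per rule.
concept = [
    {"headShape": "round", "jacketColor": "red"},
    {"headShape": "square", "holding": "balloon"},
]


def covers(rules, robot):
    """True if the robot satisfies all selectors of at least one rule."""
    return any(all(robot[a] == w for a, w in rule.items()) for rule in rules)


def is_consistent(rules, labelled):
    """True if the concept covers every positive and no negative example."""
    return all(covers(rules, robot) == positive for robot, positive in labelled)


print(is_consistent(concept, examples))
print(is_consistent([{"jacketColor": "red"}], examples))
```

A learning algorithm would search for a concept that passes this test on all stored examples; the single-rule candidate in the last line fails because it covers the negative example and misses the second positive one.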

Since moreover XML is a universal data interchange format, a web-based
storage of concepts could serve as the seed of a standardized world-wide
knowledge database platform.

Appendix: XML schema of Emerald's world 

<?xml version="1.0"?>
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.math-it.org/xml/2002/emerald.xsd"
    xmlns="http://www.math-it.org/xml/2002/emerald.xsd"
    elementFormDefault="qualified"
>
  <xsd:element name="emerald">
    <xsd:complexType>
      <xsd:sequence minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="concept" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="rule" minOccurs="0" maxOccurs="unbounded">
                <xsd:complexType>
                  <xsd:attribute name="headShape" type="HeadShape"/>
                  <xsd:attribute name="bodyShape" type="BodyShape"/>
                  <xsd:attribute name="isSmiling" type="IsSmiling"/>
                  <xsd:attribute name="holding" type="Holding"/>
                  <xsd:attribute name="jacketColor" type="Color"/>
                  <xsd:attribute name="hasTie" type="HasTie"/>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:simpleType name="HeadShape">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="round"/>
      <xsd:enumeration value="square"/>
      <xsd:enumeration value="octagon"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="BodyShape">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="round"/>
      <xsd:enumeration value="square"/>
      <xsd:enumeration value="octagon"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="IsSmiling">
    <xsd:restriction base="xsd:boolean">
      <xsd:pattern value="true"/>
      <xsd:pattern value="false"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="Holding">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="sword"/>
      <xsd:enumeration value="balloon"/>
      <xsd:enumeration value="flag"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="Color">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="red"/>
      <xsd:enumeration value="yellow"/>
      <xsd:enumeration value="green"/>
      <xsd:enumeration value="blue"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="HasTie">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="yes"/>
      <xsd:enumeration value="no"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>